Abstract

Existing probabilistic approaches to automated reasoning impose severe
restrictions on their knowledge representation schemes, mainly to ensure
that an effective inference algorithm exists. Unfortunately, this makes it
quite difficult to apply these approaches to general domains.

In this paper, we present a new model called Bayesian multi-networks,
which uses a rule-based organization of knowledge that is quite natural for
human experts modeling various domains. Furthermore, strong probabilistic
semantics help quantify the knowledge. By combining these semantics with the
rich structure of rule-based approaches, we develop a general inference
engine for Bayesian multi-networks.


1  Introduction 

The success of automated reasoning will clearly depend on its applicability to
a wide variety of problem domains. It must have a flexible knowledge
representation scheme as well as provide effective and efficient inference
mechanisms. Unfortunately, knowledge representation and inference algorithms
always seem to be at odds with one another. This is particularly true for
probabilistic approaches, which often trade off flexibility for efficiency and
even effectiveness. The major problem facing them, regardless of whether they
are being used for story understanding, robotic planning, diagnosis, expert
systems, etc., is the difficulty of modeling the desired problem domain. More
often than not, severe restrictions are placed on representation just to ensure
that the inference algorithm employed at least finishes in less than
combinatorial time once in a while. Of course, the net result of these
restrictions is the increased frustration of trying to build such knowledge
bases beyond the simplest of domains. The biggest problem rests in
re-interpreting (human) expert knowledge to fit the restricted representation
and quantifying all the required information in some reasonable fashion.

One of the most popular probabilistic approaches is Bayesian networks [7].
It offers a high level of readability by providing a clear graphical
representation while allowing for efficient computations on certain subclasses
of networks [7, 19, 9, 12, 14, 5, 1]. Although it has done much for the
probabilistic paradigm, it still falls far short of the desired level of
knowledge representation.

One such shortfall, identified in [4, 3], is called asymmetric independence.
It asserts that random variables (abbrev. r.v.s) are dependent for some
but not necessarily all of their values. Hence, a fair amount of redundancy
occurs in the conditional tables of the network. By identifying these
redundancies, it is hoped that the number of conditional probabilities can be
reduced, thus decreasing the computational time required.

In a separate research effort, [16, 15] also identified a similar problem.
Although this work was approached from the standpoint of solving the
over-specification problem when performing belief revision on Bayesian
networks, it nevertheless identifies that there exist situations where a
particular instantiation of one r.v. will render the remaining r.v.s in the
conditional “irrelevant”. In other words, there is little or no change in the
final conditional probability when the instantiations of the irrelevant r.v.s
are altered.

Although these earlier results have served to increase the expressiveness of
Bayesian networks, they are still fairly limited. If we look carefully at both
approaches, we find that they are still working entirely within the realm of
Bayesian network restrictions. Simply put, these methods take a Bayesian
network and create a set of smaller networks which are actually subnetworks of
the original. Thus, they must still for the most part obey the restrictions.

To reiterate, the most important aspects of a successful probabilistic
model are:

•  The ease with which knowledge can be encoded and decoded by a human
expert.

•  The  quantifiability  of  the  knowledge  encoded. 

•  The  existence  of  an  effective  and  efficient  inference  engine. 

Probably the best knowledge representation schemes satisfying the above
conditions are the rule-based approaches. The authors of [18] point out in
their milestone work on MYCIN that human experts have found working with
rule-based information much more natural than working with any other scheme.
Furthermore, these rules are much more easily quantified and provide a rich
structure for inferencing.

In this paper, we present a new model called Bayesian multi-networks.¹
This model permits a general rule-based organizational scheme while providing
strong probabilistic semantics for the quantified rules. Furthermore, a general
inference algorithm can be found for computing with these multi-networks.

We begin in Section 2 by describing our Bayesian multi-networks. Section 3
describes our inference algorithm. Finally, Section 4 discusses consistency
considerations when constructing a Bayesian multi-network.


2  Bayesian  Multi-networks 

Bayesian networks organize knowledge in terms of probabilistic conditional
dependencies. The world can be modeled by a collection of events and the
relationships between them. In particular, both causal and logical
relationships are represented in terms of conditional dependencies. For
example, given two events A and B where event A represents “It is raining.”
and B represents “The sidewalk is wet.”, the relationship that A causes B is
denoted by $P(B = \text{true} \mid A = \text{true})$. We say that B is
conditionally dependent on A.

We can redescribe this more clearly in terms of graphs. A Bayesian network
is a directed acyclic graph whose nodes are r.v.s and whose arcs between the
nodes represent direct conditional dependencies between the r.v.s. Given a node
A with parents $B_1, \ldots, B_n$, we say that A is conditionally dependent on
the set of r.v.s $B_1, \ldots, B_n$. Furthermore, Bayesian networks assert that
if $B_1, \ldots, B_n$ are all the parents of A, then

$$P(A = a \mid B_1 = b_1, \ldots, B_n = b_n) = P(A = a \mid B_1 = b_1, \ldots, B_n = b_n, C_1 = c_1, \ldots, C_m = c_m) \tag{1}$$

for any collection of r.v.s $\{C_1, \ldots, C_m\}$ such that

$$\{C_1, \ldots, C_m\} \cap \{A, B_1, \ldots, B_n\} = \emptyset$$

where $\emptyset$ denotes the empty set, and for any instantiation to all the
r.v.s involved.² We will see the significance of this assertion later in this
section.

The r.v.s are used to represent events in the world. An instantiation of a
r.v. to some value reflects the status of the event being modeled. For example,
instantiating A to true above implies that it is raining. Hence, we can
represent the various states of the world (scenarios) through different
instantiations to the r.v.s. Furthermore, we can properly attach a probability
to a given scenario representing the likelihood of the scenario. The question
now arises as to how we go about computing this probability. This is where the
above assertion comes into play.

¹ Different from the multinets in [3].

² Note that this isn’t quite true but suffices for our discussion. For more
information, see the notion of d-separation in [7].

Consider the scenario $\{C_1 = c_1, \ldots, C_n = c_n\}$ which instantiates all
the r.v.s in the network. We must somehow compute its joint probability. Let’s
assume that $\{C_1, C_2, \ldots, C_n\}$ is already a topological ordering of
the r.v.s consistent with the associated directed acyclic graph for the
Bayesian network. We first apply Bayes’ theorem repeatedly and get

$$P(C_1 = c_1, \ldots, C_n = c_n) = P(C_1 = c_1 \mid C_2 = c_2, \ldots, C_n = c_n) \, P(C_2 = c_2 \mid C_3 = c_3, \ldots, C_n = c_n) \cdots P(C_n = c_n). \tag{2}$$

From the assertion above on conditional dependencies, we can then substitute
the conditional probabilities with smaller ones from (1), reducing the problem
to one of simply multiplying the appropriate conditional probabilities together
from our network.
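
As a minimal sketch of this computation, the Python fragment below multiplies
one conditional probability per r.v. for a complete scenario. The two-node
network (Rain and Wet, echoing the raining/sidewalk example), the table layout
and all numbers are assumptions made purely for illustration; they are not
taken from the paper.

```python
from math import prod

# Hypothetical network: Rain -> Wet. Each table maps
# (value, parent values...) -> probability. All numbers are made up.
parents = {"Rain": (), "Wet": ("Rain",)}
cpt = {
    "Rain": {(True,): 0.2, (False,): 0.8},
    "Wet": {
        (True, True): 0.9, (False, True): 0.1,
        (True, False): 0.1, (False, False): 0.9,
    },
}

def joint_probability(scenario):
    """Multiply one conditional probability per r.v.

    This is the product in (2) after each factor has been shortened to
    condition only on the r.v.'s parents, as allowed by assertion (1).
    """
    factors = []
    for var, value in scenario.items():
        key = (value,) + tuple(scenario[p] for p in parents[var])
        factors.append(cpt[var][key])
    return prod(factors)

p = joint_probability({"Rain": True, "Wet": True})  # P(Rain) * P(Wet | Rain)
```

Because every factor is looked up directly, the cost is linear in the number
of r.v.s once the scenario is fixed.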

One of the computations performed on Bayesian networks is belief revision.
It is also called the search for the most-probable explanation and is
characterized as follows: Given a set of instantiations e called evidence on
some subset of the r.v.s, find an instantiation $w^*$ to all the r.v.s such
that

$$P(w^* \mid e) = \max_w P(w \mid e).$$

Furthermore, belief revision is a model for abductive reasoning [2]. We call
the instantiation sets w complete instantiation sets or complete scenarios.
They are also called explanations for e.
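
The definition can be made concrete with a brute-force sketch that enumerates
every complete scenario consistent with the evidence and keeps the most
probable one. This is exhaustive enumeration, not one of the efficient
algorithms cited in the paper, and the toy network, its numbers and the
function names are illustrative assumptions.

```python
from itertools import product

# Hypothetical two-node network with made-up probabilities.
parents = {"Rain": (), "Wet": ("Rain",)}
domain = {"Rain": (True, False), "Wet": (True, False)}
cpt = {
    "Rain": {(True,): 0.2, (False,): 0.8},
    "Wet": {
        (True, True): 0.9, (False, True): 0.1,
        (True, False): 0.1, (False, False): 0.9,
    },
}

def joint(w):
    """Joint probability of a complete scenario w."""
    result = 1.0
    for var, value in w.items():
        key = (value,) + tuple(w[p] for p in parents[var])
        result *= cpt[var][key]
    return result

def most_probable_explanation(evidence):
    """Return the complete scenario w maximizing P(w, e).

    P(w | e) = P(w, e) / P(e) and P(e) is constant, so maximizing the
    joint over scenarios consistent with e also maximizes P(w | e).
    """
    names = list(domain)
    best, best_p = None, -1.0
    for values in product(*(domain[n] for n in names)):
        w = dict(zip(names, values))
        if any(w[k] != v for k, v in evidence.items()):
            continue  # scenario contradicts the evidence
        p = joint(w)
        if p > best_p:
            best, best_p = w, p
    return best, best_p

w_star, p_star = most_probable_explanation({"Wet": True})
```

The enumeration is exponential in the number of r.v.s, which is why the
structure of the network matters for practical inference.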

Notation.  Given  a  r.v.  A,  R(A)  represents  the  set  of  all  possible  instantiations 
to  A. 

Notation. Given a conditional probability
$P(A = a \mid B_1 = b_1, \ldots, B_n = b_n)$, we define

$$\text{head}(P(\ldots)) = \{A = a\}$$
$$\text{tail}(P(\ldots)) = \{B_1 = b_1, \ldots, B_n = b_n\}$$
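
One possible encoding of this notation, given only as a sketch: a conditional
probability becomes a record holding its head instantiation, its tail
instantiations and its numeric value. The type names, the (name, value) tuple
encoding and the value 0.7 are assumptions for illustration.

```python
from typing import FrozenSet, NamedTuple, Tuple

# Hypothetical encoding: an instantiation {A = a} is the pair ("A", "a").
Instantiation = Tuple[str, str]

class CondProb(NamedTuple):
    head: Instantiation             # the single instantiation being supported
    tail: FrozenSet[Instantiation]  # the conditioning instantiations
    value: float                    # the probability itself

def head(p: CondProb) -> FrozenSet[Instantiation]:
    """head(P(...)) as a one-element set, matching the notation."""
    return frozenset({p.head})

def tail(p: CondProb) -> FrozenSet[Instantiation]:
    """tail(P(...)) as the set of conditioning instantiations."""
    return p.tail

# P(A = a | B1 = b1, B2 = b2) = 0.7 (illustrative value)
p = CondProb(("A", "a"), frozenset({("B1", "b1"), ("B2", "b2")}), 0.7)
```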

In essence, we can simply describe a Bayesian network as a database of
conditional probabilities which satisfy certain conditions.

1.  For any $P(A = a \mid B_1 = b_1, \ldots, B_n = b_n)$ in the database, the
following condition holds:

$$P(A = a \mid B_1 = b_1, \ldots, B_n = b_n) = P(A = a \mid B_1 = b_1, \ldots, B_n = b_n, C_1 = c_1, \ldots, C_m = c_m) \tag{3}$$

for any³ $\{C_1 = c_1, \ldots, C_m = c_m\}$ such that

$$\{C_1, \ldots, C_m\} \cap \{A, B_1, \ldots, B_n\} = \emptyset.$$

2.  For each r.v. A, there exists a unique collection of r.v.s
$\{B_1, \ldots, B_n\}$ such that all conditional probabilities for A in the
database are of the form:

$$P(A = a \mid B_1 = b_1, \ldots, B_n = b_n).$$

3.  For each r.v. A conditioned on $\{B_1, \ldots, B_n\}$, for each
$a \in R(A), b_1 \in R(B_1), \ldots, b_n \in R(B_n)$,

$$P(A = a \mid B_1 = b_1, \ldots, B_n = b_n) \tag{4}$$

is in the database.

4.  We can find a topological ordering on the r.v.s based on the conditional
probabilities in the database.

The first condition is simply our conditional dependency assertion. The second
condition reflects the requirement of Bayesian networks that they be built
around r.v.s. In particular, independencies are between r.v.s and not between
the different instantiations of the r.v.s. The third requires that the tables
be complete, while the fourth requires our network to be acyclic in nature.

A major representational restriction of Bayesian networks arises from the
second and fourth conditions, as we mentioned earlier in our introduction. Our
approach is to eliminate this restriction in order to accommodate a more
powerful representational scheme. In particular, we wish to be able to
represent the following information: Given r.v.s A, B, C, D, $A = a_1$ is only
conditionally dependent on $B = b$ and $C = c_1$ whereas $C = c_2$ is only
conditionally dependent on $A = a_2$ and $D = d$ where $a_1 \neq a_2$ and
$c_1 \neq c_2$. These conditions correspond to the rules

$$\{B = b\} \wedge \{C = c_1\} \rightarrow \{A = a_1\}$$

$$\{A = a_2\} \wedge \{D = d\} \rightarrow \{C = c_2\}$$

Clearly, this organization violates both the second and fourth conditions of
Bayesian networks.

We  now  formalize  our  Bayesian  multi-networks. 

Definition 2.1. A Bayesian multi-network M is an ordered pair
$(V, \mathcal{P})$ where V is a finite set of r.v.s and $\mathcal{P}$ is a
finite collection of conditional probabilities on V such that for all
probabilities $P(A = a \mid B_1 = b_1, \ldots, B_n = b_n) \in \mathcal{P}$,

$$P(A = a \mid B_1 = b_1, \ldots, B_n = b_n, C_1 = c_1, \ldots, C_m = c_m) = P(A = a \mid B_1 = b_1, \ldots, B_n = b_n)$$

for any instantiations to $\{C_1, \ldots, C_m\} \subseteq V$ such that

$$\{C_1, \ldots, C_m\} \cap \{A, B_1, \ldots, B_n\} = \emptyset.$$

³ Again, see the notion of d-separation.


We have defined our multi-network in terms of a database of conditional
probabilities. Obviously, given the conditional dependency restriction, we can
still calculate joint probabilities similar to what we did before for Bayesian
networks. We simply pick an appropriate collection of conditional probabilities
from the database such that it satisfies (2). Intuitively, we can view

$$P(A = a \mid B_1 = b_1, \ldots, B_n = b_n)$$

in the database as the information that

$\{A = a\}$ is supported by $\{B_1 = b_1, \ldots, B_n = b_n\}$.

Hence, the collection of conditional probabilities used in calculating a joint
probability represents a causal inference chain. In a Bayesian network, there
is exactly one such chain, which is dictated by the directed acyclic ordering
of all the r.v.s. For our multi-networks, the ordering is asserted at a finer
level than just between r.v.s. It is ordered with respect to particular
instantiations of each r.v. In essence, this allows us to have multiple causal
chains. Hence, we have a multi-network.

Furthermore, like Bayesian networks, multi-networks also have a graphical
form. Instead of nodes representing r.v.s, they will represent individual r.v.
instantiations such as $\{A = a\}$. More specifically, we have two types of
nodes in the multi-network graph. The first type represents r.v. instantiations
whereas the second type is used to explicitly represent the one or more
different ways a particular r.v. instantiation can be supported. The parents of
an instantiation node are support nodes. Each support node corresponds to a
conditional probability in the database whose head is the instantiation node
and whose tail consists of the parents of the support node. See Figure 2.1.
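
The construction just described can be sketched in a few lines of Python. The
database below reuses the two rules from earlier in this section and adds a
hypothetical third entry so that $\{A = a_1\}$ has two support nodes; the
probabilities, the support-node names `s0`, `s1`, ... and the function name
are assumptions for illustration only.

```python
# Each entry is (head, tail, probability); probabilities are illustrative.
database = [
    (("A", "a1"), (("B", "b"), ("C", "c1")), 0.7),
    (("C", "c2"), (("A", "a2"), ("D", "d")), 0.4),
    (("A", "a1"), (("C", "c2"),), 0.5),  # hypothetical second support for A = a1
]

def build_graph(database):
    """Build the two parent maps of the multi-network graph.

    supports_of: instantiation node -> its parent support nodes
    tail_of:     support node -> its parent instantiation nodes (the tail)
    """
    supports_of = {}
    tail_of = {}
    for i, (head, tail, _prob) in enumerate(database):
        support = f"s{i}"  # one support node per conditional probability
        supports_of.setdefault(head, []).append(support)
        tail_of[support] = tuple(tail)
        for inst in tail:
            # Tail instantiations are nodes too, even with no support yet.
            supports_of.setdefault(inst, [])
    return supports_of, tail_of

supports_of, tail_of = build_graph(database)
```

Here `supports_of[("A", "a1")]` lists two support nodes, which is exactly the
situation a single conditional table per r.v. cannot express.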

As we can easily see, Bayesian networks are simply a special case of Bayesian
multi-networks. We now briefly mention a special class of multi-networks whose
properties will be particularly useful for our computation techniques in the
next sections.

Definition 2.2. A Bayesian multi-network $M = (V, \mathcal{P})$ is said to be
anti-cyclic if and only if there exists a topological ordering on all the
individual r.v. instantiations.

According to this definition, it follows straightforwardly that Bayesian
networks are also anti-cyclic multi-networks.
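
A minimal sketch of an anti-cyclicity test follows, under the assumed reading
that every head instantiation must come after all instantiations in its tail.
It relies on Python's standard `graphlib` module; the function name and the
tuple encoding of instantiations are illustrative choices.

```python
from graphlib import CycleError, TopologicalSorter

def instantiation_order(database):
    """Return a topological order of the instantiations, or None.

    database is a list of (head, tail) pairs; each head must come after
    every instantiation in its tail.
    """
    preds = {}
    for head, tail in database:
        preds.setdefault(head, set()).update(tail)
        for inst in tail:
            preds.setdefault(inst, set())
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        return None  # some instantiations support each other in a cycle

# The two rules from this section: cyclic between the r.v.s A and C,
# yet orderable at the level of individual instantiations.
rules = [
    (("A", "a1"), [("B", "b"), ("C", "c1")]),
    (("C", "c2"), [("A", "a2"), ("D", "d")]),
]
# A database whose instantiations support each other in a loop.
cyclic = [
    (("A", "a"), [("B", "b")]),
    (("B", "b"), [("C", "c")]),
    (("C", "c"), [("A", "a")]),
]

order = instantiation_order(rules)
no_order = instantiation_order(cyclic)
```

The first database is anti-cyclic even though A depends on C and C depends on
A as r.v.s; the second is not.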

Fig. 2.1. Building a graph for a Bayesian multi-network.


3  Computing Explanations

We now consider belief revision computations on our multi-networks similar to
those for normal Bayesian networks. As in Section 2, given a set of
instantiations e called evidence, we search for a complete instantiation set
$w^*$ such that

$$P(w^* \mid e) = \max_w P(w \mid e).$$

However, determining $w^*$, the most-probable explanation for e, on Bayesian
networks requires that the conditional probability tables be complete (see
condition 3 for Bayesian networks in the previous section). Existing methods
such as message-passing schemes [7, 19] rely on this property to “localize”
the computations in the network.

Still, we can readily define a best explanation for an incomplete system
(either Bayesian networks or multi-networks) as follows: We say that a complete
instantiation set w is computable if there exists a collection of conditional
probabilities from our database which can be multiplied to compute the joint
probability from (2). This works because of the conditional dependency
assertion we placed on the conditional probabilities in our database. We can
then say that the best explanation is really the most probable computable
explanation (abbrev. MPCE).

Even if our multi-networks are probabilistically complete, almost all existing
methods for belief revision on Bayesian networks are not extendable to our new
model. We have already mentioned the locality restriction required by
message-passing schemes above. Another more serious restriction is acyclicity
of the information [7, 19, 17]. Our multi-networks can contain directed cycles.

The  one  method  which  can  be  extended  from  Bayesian  networks  to  our  new 
multi-networks  is  Santos’  linear  programming  method  [9,  1,  10,  11].  Basically, 
the  search  for  the  MPCE  is  transformed  into  an  integer  linear  programming 
problem. 

The transformation involves mapping r.v. instantiations into some
multi-dimensional space which we will denote by $\mathbb{R}^n$. A subspace of
$\mathbb{R}^n$ will represent “valid” instantiations, where valid includes
things like being consistent with the given evidence e, each r.v. having at
most one instantiation, etc. In particular, we are interested in transforming
it into a polyhedral convex set.⁴ Such a set can be described by a collection
of linear inequalities. As it turns out, these inequalities will intuitively
correspond to the restrictions/constraints required in making valid
instantiations of the r.v.s. Finally, we would like to define a linear energy
function such that by minimizing it over the convex set, the resulting answer
will be the best explanation after we make the appropriate inverse mapping.
Thus, we would have the makings of a linear constraint satisfaction problem.

⁴ “Polyhedral” refers to the fact that the boundaries of the subspace are
composed of hyperplanes.

For the time being, let’s consider the class of anti-cyclic multi-networks. Let
A be a r.v. and a be a possible instantiation for A. If A is instantiated to
a, that is, $A = a$, then we would like to set a real variable $A_a$ to the
value 1. If $A \neq a$, then $A_a = 0$. This holds for every possible
instantiation of A for every r.v. A. Next, each r.v. must have exactly one
instantiation. This can be achieved with the linear constraint

$$\sum_{a \in R(A)} A_a = 1 \tag{5}$$

where R(A) is the set of all possible instantiations for A.

Now, we must tie the random variable instantiations to the conditional
probabilities in the database. For each $p \in \mathcal{P}$, associate a real
variable $q_p$ called a conditional variable. When $q_p$ is 1, this implies
that p is being used in the joint probability computation. Otherwise,
$q_p = 0$ implies it is not being used.

Clearly, if r.v. A is not instantiated to a, then any conditional probability
with $A = a$ in either its head or tail should not be used:

$$K \cdot A_a \geq \sum_{p \in Q(\{A = a\})} q_p \tag{6}$$

where K is some arbitrarily large constant and $Q(\{A = a\})$ is the collection
of all conditional probabilities in $\mathcal{P}$ which contain $A = a$.

Finally, since we know that each r.v. A must be instantiated to exactly one
element, this implies that exactly one of the conditional probabilities in
$\mathcal{P}$ whose head has A must be used:

$$\sum_{p \in H(A)} q_p = 1 \tag{7}$$

where H(A) is the collection of all conditional probabilities in $\mathcal{P}$
whose head contains A.

Taking  all  these  constraints  together,  we  can  prove  the  following  theorems: 

Theorem 3.1. Any 0-1 assignment to the real variables which satisfies the above
constraints corresponds to a valid complete instantiation of the r.v.s in an
anti-cyclic multi-network. Furthermore, the corresponding complete
instantiations are computable.

Theorem 3.2. The converse of Theorem 3.1 is also true.

Evidence is incorporated into the constraints by clamping the associated real
variables to 1. For example,

$$\{A = a\} \in e \Rightarrow A_a = 1.$$

To complete our transformation, we need to define the objective function we
wish to optimize. Obviously, we can directly compute the joint probability of a
complete instantiation by multiplying the associated probabilities of the
conditional variables which have been assigned a value of 1. To recast it as a
linear function for our integer linear program, we take the negative logarithm
of each probability and minimize

$$\sum_{p \in \mathcal{P}} (-\log p) \, q_p. \tag{8}$$

Theorem 3.3. A 0-1 assignment which satisfies the above constraints and
minimizes the objective function is an MPCE for the original anti-cyclic
multi-network.

Hence, we have transformed our problem for anti-cyclic multi-networks into
integer linear programming.
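
To see the constraints and the objective working together, the sketch below
solves a tiny instance by enumerating every 0-1 assignment instead of calling
an integer programming solver. It checks the one-value-per-r.v. constraint (5),
the usage constraint (6), the one-probability-per-head constraint (7) and the
evidence clamp, then minimizes the negative-log objective (8). The toy
database, its numbers and all function names are assumptions for illustration,
and the exhaustive search stands in for a real solver.

```python
from itertools import product
from math import log

# Toy anti-cyclic database: (head, tail, probability); values are made up.
database = [
    (("R", "t"), (), 0.2),
    (("R", "f"), (), 0.8),
    (("W", "t"), (("R", "t"),), 0.9),
    (("W", "f"), (("R", "t"),), 0.1),
    (("W", "t"), (("R", "f"),), 0.1),
    (("W", "f"), (("R", "f"),), 0.9),
]
insts = sorted({i for h, t, _ in database for i in (h, *t)})
rvs = sorted({rv for rv, _ in insts})
K = len(database)  # large enough: at most len(database) variables can be 1

def feasible(x, q, evidence):
    """Check the evidence clamp and constraints (5), (6), (7)."""
    if any(x[i] != 1 for i in evidence):
        return False
    for rv in rvs:
        # (5): each r.v. takes exactly one value.
        if sum(x[i] for i in insts if i[0] == rv) != 1:
            return False
        # (7): exactly one probability with this r.v. in its head is used.
        if sum(q[j] for j, (h, _, _) in enumerate(database) if h[0] == rv) != 1:
            return False
    for i in insts:
        # (6): probabilities mentioning {A = a} need A_a = 1 to be used.
        used = sum(q[j] for j, (h, t, _) in enumerate(database) if i == h or i in t)
        if K * x[i] < used:
            return False
    return True

def solve(evidence):
    """Return (cost, chosen instantiations) minimizing objective (8)."""
    best = None
    for xs in product((0, 1), repeat=len(insts)):
        x = dict(zip(insts, xs))
        for q in product((0, 1), repeat=len(database)):
            if feasible(x, q, evidence):
                cost = sum(-log(p) * q[j] for j, (_, _, p) in enumerate(database))
                if best is None or cost < best[0]:
                    best = (cost, {i for i in insts if x[i]})
    return best

cost, explanation = solve(evidence=[("W", "t")])
```

Since the logarithm is monotone, minimizing the summed negative logs selects
the same assignment as maximizing the product of the chosen probabilities.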

We began by looking at anti-cyclic multi-networks instead of the more general
case since the resulting constraint system is much quicker to comprehend.
When we have non-anti-cyclic multi-networks, we can encounter the following
situation: Given r.v.s A, B and C, assume $\mathcal{P}$ only contains

$$P(A = a \mid B = b)$$

$$P(B = b \mid C = c)$$

$$P(C = c \mid A = a)$$

Clearly, $\mathcal{P}$ is incomplete. However, if we attempt to compute

$$P(A = a, B = b, C = c)$$

using our above method, we would simply multiply the three probabilities
together. This is incorrect since $P(A = a, B = b, C = c)$ is not computable.
An anti-cyclic multi-network forbids this type of database. Unfortunately, it
also precludes databases of the form

$$P(A = a \mid B = b, D = d_1)$$

$$P(B = b \mid C = c, D = d_1)$$

$$P(C = c \mid A = a, D = d_2)$$

$$P(C = c \mid D = d_1)$$

$$P(A = a \mid D = d_2)$$

$$P(B = b \mid D = d_2)$$

where

$$P(A = a, B = b, C = c, D = d_1)$$

and

$$P(A = a, B = b, C = c, D = d_2)$$

are computable.

As it turns out, we can solve this with additional constraints that guarantee a
topological ordering among the instantiations. We associate a topological value
variable $t_{A_a}$ to each $\{A = a\}$. Basically, this variable will contain a
number reflecting a topological ordering. For each pair of instantiations
$\{A = a\}$ and $\{B = b\}$, add the following constraints:

$$K \cdot m_{A_a, B_b} \geq \sum_{p \in M(\{A = a\}, \{B = b\})} q_p \tag{9}$$

where $M(\{A = a\}, \{B = b\})$ is the set of all conditional probabilities in
$\mathcal{P}$ with head $\{A = a\}$ and whose tail contains $\{B = b\}$, and K
is an arbitrarily large positive number. $m_{A_a, B_b}$ will function as a flag
indicating that $\{A = a\}$ topologically dominates $\{B = b\}$ since one of
the probabilities in $M(\{A = a\}, \{B = b\})$ has been used.

$$K (1 - m_{A_a, B_b}) + t_{A_a} - t_{B_b} > 0 \tag{10}$$

Note that our topological variables need not be integer values. Since both
topological dominance and “>” are transitive, these equations will guarantee
that any solution will have a topological ordering.
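
The effect of the topological constraints can be illustrated without a solver.
Given the conditional probabilities that have been selected for use, the sketch
below tries to assign each instantiation a value strictly larger than the
values of everything in the tails that support it, and reports failure when
the selected probabilities form a directed cycle. The layered assignment and
the function name are illustrative assumptions; the paper itself obtains such
values through linear constraints.

```python
def topological_values(used):
    """Assign values with value[head] > value[b] for every b in a used tail.

    used is a list of (head, tail) pairs for the selected conditional
    probabilities. Returns a dict of values, or None if no assignment exists.
    """
    dominated = {}  # instantiation -> instantiations it must dominate
    for head, tail in used:
        dominated.setdefault(head, set()).update(tail)
        for inst in tail:
            dominated.setdefault(inst, set())
    values = {}
    remaining = dict(dominated)
    while remaining:
        # An instantiation is ready once everything it dominates has a value.
        ready = [i for i, below in remaining.items()
                 if all(b in values for b in below)]
        if not ready:
            return None  # directed cycle: the strict inequalities conflict
        for i in ready:
            values[i] = 1 + max((values[b] for b in remaining[i]), default=-1)
            del remaining[i]
    return values

# The three-probability loop from the text: no valid values exist.
cyclic = [
    (("A", "a"), [("B", "b")]),
    (("B", "b"), [("C", "c")]),
    (("C", "c"), [("A", "a")]),
]
# The probabilities used when D = d1 in the larger example.
with_d1 = [
    (("A", "a"), [("B", "b"), ("D", "d1")]),
    (("B", "b"), [("C", "c"), ("D", "d1")]),
    (("C", "c"), [("D", "d1")]),
]
# The probabilities used when D = d2 in the larger example.
with_d2 = [
    (("C", "c"), [("A", "a"), ("D", "d2")]),
    (("A", "a"), [("D", "d2")]),
    (("B", "b"), [("D", "d2")]),
]

t_cyclic = topological_values(cyclic)
t1 = topological_values(with_d1)
t2 = topological_values(with_d2)
```

Both selections from the larger database receive valid values even though the
database as a whole is not anti-cyclic, while the three-probability loop is
rejected.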

A topological ordering actually corresponds to a proof path. An explanation
for e is a proof for e. This is the heart of abductive reasoning [6, 2, 8].
When we have cyclicity, we often end up with circular proofs, which are invalid
[13]. However, requiring a topological ordering will eliminate these circular
cases.


4  Database  Consistency  Checking 

Building  a  Bayesian  multi-network  is  relatively  straightforward  since  any  rule- 
based  construction  approach  is  sufficient  and  our  computational  method  will 
work  on  the  entire  class.  About  the  only  thing  to  take  note  of  while  constructing 
the  probabilistic  database  is  the  issue  of  consistency  between  the  probabilities. 
In  particular,  we  wish  to  avoid  the  possibility  of  having  two  different  collections 
of  conditional  probabilities  that  when  multiplied  together  result  in  different 
values  for  a  single  joint  distribution.  For  example,  consider  the  simple  scenario 
where  we  have  the  r.v.s  A,  B  and  C  and  we  have  the  following  database: 

$$P(A = a \mid B = b)$$

$$P(B = b \mid C = c)$$

$$P(C = c)$$

$$P(B = b \mid A = a)$$

$$P(A = a \mid C = c)$$

As we can easily see, we can compute $P(A = a, B = b, C = c)$ in two different
ways.

We now provide a relatively straightforward algorithm for checking database
consistency. It exploits our conditional dependency assertion: For any two
probabilities $p_1, p_2$ in $\mathcal{P}$, if head($p_1$) = head($p_2$) and
there does not exist a r.v. C in tail($p_1$) and tail($p_2$) such that
$\{C = c_1\} \in$ tail($p_1$) and $\{C = c_2\} \in$ tail($p_2$) where
$c_1 \neq c_2$, then the probabilities for $p_1$ and $p_2$ must be the same.

Clearly, this is a very crude algorithm and many optimizations can be made.
However, we can prove that it is sufficient to guarantee that our database is
probabilistically consistent, and it can be computed in
$O(|\mathcal{P}|^2 |V|)$ time.
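
A direct, unoptimized sketch of this pairwise check is given below. It is run
on the five-entry example database from this section with made-up probability
values; the tolerance parameter `tol` and the function names are assumptions
added for illustration.

```python
from itertools import combinations

def conflicting(tail1, tail2):
    """True if some r.v. is given different values in the two tails."""
    values1 = dict(tail1)
    return any(rv in values1 and values1[rv] != val for rv, val in tail2)

def inconsistent_pairs(database, tol=1e-12):
    """Return pairs with equal heads, compatible tails, unequal values."""
    bad = []
    for (h1, t1, p1), (h2, t2, p2) in combinations(database, 2):
        if h1 == h2 and not conflicting(t1, t2) and abs(p1 - p2) > tol:
            bad.append(((h1, t1), (h2, t2)))
    return bad

# The example database from this section, with illustrative numbers.
database = [
    (("A", "a"), (("B", "b"),), 0.7),  # P(A = a | B = b)
    (("B", "b"), (("C", "c"),), 0.5),  # P(B = b | C = c)
    (("C", "c"), (), 0.4),             # P(C = c)
    (("B", "b"), (("A", "a"),), 0.5),  # P(B = b | A = a)
    (("A", "a"), (("C", "c"),), 0.6),  # P(A = a | C = c)
]

bad = inconsistent_pairs(database)
```

With these numbers the two entries for $\{A = a\}$ are flagged because their
tails are compatible yet their values differ, while the two entries for
$\{B = b\}$ pass because they agree.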

5  Conclusion 

Traditional  probabilistic  approaches  to  automated  reasoning  have  suffered  from 
representational  inadequacies  in  order  to  provide  effective  inference  mechanisms. 
In  particular,  the  restrictions  imposed  have  rendered  it  quite  difficult  to  build 
such  knowledge  bases  let  alone  quantify  them. 

In this paper, we have developed a new framework called Bayesian
multi-networks. This approach provides an extremely natural method for
organizing our knowledge through a rule-based scheme. Few or no restrictions
are imposed that may hinder the construction of such a network for use in
almost any domain. Furthermore, it provides a graphical representation which
permits an easier visualization of the information by the human expert.

Strong probabilistic semantics aid in the quantification of information.
Combining them with the rich structure inherent in a rule-based approach, we
developed a general inference engine using linear programming techniques.

Bayesian networks are one of the most popular approaches for automated
reasoning. Our new Bayesian multi-networks formulation subsumes Bayesian
networks. Because of this, Bayesian networks are quite easily translated to our
new model.

Finally, recent work on efficient algorithms for Bayesian network computations
using integer linear programming [11] seems to be readily applicable to our
multi-networks. In particular, we have observed that the basic structures of
the linear programming formulations are similar in many regards.